INEX Tweet Contextualization Track at CLEF 2012: Query Reformulation using Terminological Patterns and Automatic Summarization

نویسندگان

  • Jorge Vivaldi
  • Iria da Cunha
چکیده

The tweet contextualization INEX task at CLEF 2012 consists of the developing of a system that, given a tweet, can provide some context about the subject of the tweet, in order to help the reader to understand it. This context should take the form of a readable summary, not exceeding 500 words, composed of passages from a provided Wikipedia corpus. Our general approach to get this objective is the following: we perform some automatic reformulations of the initial tweets provided for the task (obtaining a list of terms related with the main topic of all them using terminological patterns). Then, using these reformulated tweets, we obtain related documents with the search engine Indri. Finally, we use REG, an automatic extractive summarization system based on graphs, to summarize these documents and provide the summary associated to each tweet.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Method for Short Message Contextualization: Experiments at CLEF/INEX

This paper presents the approach we developed for automatic multi-document summarization applied to short message contextualization, in particular to tweet contextualization. The proposed method is based on named entity recognition, part-of-speech weighting and sentence quality measuring. In contrast to previous research, we introduced an algorithm from smoothing from the local context. Our app...

متن کامل

Tweet Contextualization using Continuous Space Vectors: Automatic Summarization of Cultural Documents

In this paper we describe our participation in the INEX 2016 Tweet Contextualization track. The tweet contextualization process aims at generating a short summary from Wikipedia documents related to the tweet. In our approach, we analyzed tweets and created a query to retrieve the most relevant Wikipedia article. We combine Information Retrieval and Automatic Text Summarization methods to gener...

متن کامل

LIA/LINA at the INEX 2012 Tweet Contextualization track

In this paper we describe our participation in the INEX 2012 Tweet Contextualization track and present our contributions. We combined Information Retrieval, Automatic Summarization and Topic Modeling techniques to provide the context of each tweet. We first formulate a specific query using hashtags and important words in the Tweets to retrieve the most relevant Wikipedia articles. Then, we segm...

متن کامل

A Hybrid Tweet Contextualization System using IR and Summarization

The article presents the experiments carried out as part of the participation in the Tweet Contextualization (TC) track of INEX 2012. We have submitted three runs. The INEX TC task has two main sub tasks, Focused IR and Automatic Summarization. In the Focused IR system, we first preprocess the Wikipedia documents and then index them using Nutch with NE field. Stop words are removed and all NEs ...

متن کامل

Three Statistical Summarizers at CLEF-INEX 2013 Tweet Contextualization Track

According to the organizers, the objective of the 2014 CLEFINEX Tweet Contextualization Task is: “...The Tweet Contextualization aims at providing automatically information a summary that explains the tweet. This requires combining multiple types of processing from information retrieval to multi-document summarization including entity linking.” We present three statistical summarizer systems ap...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012